A fixed-point algorithm has been used to obtain the parameters (i.e.,
decision and representative levels) of an "optimum" quantizer that
minimizes a quite general distortion measure, subject to an entropy
constraint on its output. Construction of the algorithm starts with a
point-to-set mapping whose fixed point satisfies the well-known
Karush-Kuhn-Tucker conditions necessary for a local extremum. A
computer program is then used to determine a fixed point of this
mapping. Several examples are solved, and correspondence with the
existing results in the literature is pointed out. Finally, as
conjectured, the growth of the computations as a function of
dimensionality n (n: number of representative levels) is found to be of
the form $a \cdot n^b$, where a is a positive constant and
$1.5 \le b \le 2.0$.

I. INTRODUCTION 

Simple quantization 1-3 has been and continues to be a popular
method of digitizing analog signals. The relative ease with which
quantizers can be implemented in hardware and their near-optimum
performance have let them withstand the challenge from several
new coding schemes. 4-6 The universal use of quantizers has naturally
spurred significant activity in optimizing their performance, some
of which is summarized in the next few paragraphs. Our objective
in this paper is to show how the problem of obtaining the parameters
of an optimum quantizer can be converted to the problem of obtaining
fixed points of a suitably constructed mapping, and then to use a
fixed-point algorithm to solve the problem numerically.

Quantizers have been optimized based on several criteria. In order 
to discuss these in relation to the problem considered in this paper, we 
describe the basic quantizer equations. Given a scalar random variable
T with probability density p(t), a quantizer Q is a map $Q(t) = y_i$
whenever $x_i \le t < x_{i+1}$, where $x_i$, $i = 1, \ldots, N+1$, and
$y_i$, $i = 1, \ldots, N$, are the decision and representative levels
of the quantizer, respectively. The performance of the quantizer is
judged generally in terms of two quantities: the distortion

$$D = \sum_{i=1}^{N} \int_{x_i}^{x_{i+1}} g(t - y_i)\, f(t)\, dt, \tag{1}$$

and the entropy

$$\mathcal{E} = -\sum_{i=1}^{N} p_i \log_2 p_i, \tag{2}$$

where g is a nonnegative function, f is a nonnegative weighting
function that weights the quantization noise, and

$$p_i = \int_{x_i}^{x_{i+1}} p(t)\, dt.$$

Optimum quantizers choose their parameters $\{x_i\}$, $i = 2, \ldots, N$,
and $\{y_i\}$, $i = 1, \ldots, N$ (given the end points $x_1$, $x_{N+1}$)
to optimize a certain combination of D and $\mathcal{E}$.

Most quantization literature takes the weighting function f to be the
same as the probability density p, although in some applications 7-9 a
different weighting function performs better. Most of the earlier work
is concerned with minimizing D for a given number of levels. Panter
and Dite 10 have used $g(\cdot) = |\cdot|^r$ (r > 0) and obtained an
approximate optimum quantizer as one in which each of the quantizing
intervals $[x_i, x_{i+1}]$ makes an equal contribution to the integral
of $|t - y_i|^r$. This allowed them to choose the quantizer parameters
for large N. Lloyd 11 and Max 12 have developed an algorithm for r = 2,
which corresponds to minimizing the mean square error. Bruce 13 has
used dynamic programming to solve the same problem in slightly more
generality by taking a general function $g(\cdot)$. Simpler suboptimal
algorithms and bounds on the performance of the quantizers have been
obtained by Roe, 14 Algazi, 15 and Zador. 16

Representation of the quantizer output by a variable length code
allows reduction of the average bit rate of the quantizer when $p_i$
varies with i. Use of a Huffman code 17 makes the average bit rate
approach the entropy of the quantizer output. Thus, the problem of
designing an optimum quantizer 18-19 can be reformulated as that of
obtaining the decision and representative levels to minimize D subject
to a constraint on the entropy. Goblick and Holsinger have considered
this problem for uniform quantizers and have concluded that for
gaussian density, for r = 2, and for the same distortion, the entropy
of the output of the uniform quantizer is higher than the theoretical
lower bound based on the rate distortion theory by about 1/4 bit.
Uniform quantizers are also good in an asymptotic sense, since they are
optimum for a large number of levels. 21 Moreover, for Laplacian
densities, as shown by Berger, 20 uniform quantizers are optimum for
any value of entropy. A different type of distortion measure has been
considered by Elias. 22

The problem we consider is that of obtaining the parameters of
quantizers such that D is minimized for a given constraint on the
entropy. Although the approach taken here is suitable for a general
distortion measure of eq. (1), we consider only the case of
$g(\cdot) = (\cdot)^2$, mainly to compare our results to those in the
literature. In the next section, we present the necessary conditions
that the optimum quantizer must satisfy for a local extremum. Then, in
Section III, we construct a point-to-set mapping such that its fixed
point satisfies the necessary conditions for a local extremum of our
problem. A description of the algorithm is then presented for
completeness. In Section IV, we present the results of applying this
algorithm to uniform, Laplacian, and gaussian densities. The
distortion-entropy curves are presented for each case. We also present
a surprising observation on the growth of computations as a function
of dimensionality (i.e., the number of quantizer parameters to be
optimized).

II. FORMULATION OF THE PROBLEM AND NECESSARY CONDITIONS 

Using $g(\cdot) = (\cdot)^2$, the distortion of eq. (1) becomes

$$D = \sum_{j=1}^{N} \int_{x_j}^{x_{j+1}} (t - y_j)^2 f(t)\, dt. \tag{3}$$

Then the problem is to obtain $\{x_j\}$, $j = 2, \ldots, N$, and
$\{y_j\}$, $j = 1, \ldots, N$, such that they minimize D subject to
$\mathcal{E} \le K$, for a given N. The necessary conditions from the
Karush-Kuhn-Tucker theory 23 are that there exists a $\lambda \ge 0$
such that

$$\nabla D(x) + \lambda \nabla \mathcal{E}(x) = 0, \tag{4}$$

where x is a vector of quantizer parameters and $\nabla$ denotes the
gradient. For the parameters $\{y_j\}$, since $\mathcal{E}$ is
independent of $\{y_j\}$, (4) becomes

$$y_j = \int_{x_j}^{x_{j+1}} t f(t)\, dt \Big/ \int_{x_j}^{x_{j+1}} f(t)\, dt, \qquad j = 1, \ldots, N. \tag{5}$$

This implies that the representative levels can be obtained explicitly
by knowing the decision levels and therefore they do not add to the
dimensionality of the problem. Also, the other necessary conditions are

$$\mathcal{E} \le K \quad \text{and} \quad \lambda(\mathcal{E} - K) = 0. \tag{6}$$
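
To make the quantities of this section concrete, the following Python sketch evaluates the centroids of eq. (5), the distortion of eq. (3), and the entropy of eq. (2) for a given set of decision levels. It is only an illustration: the function name `quantizer_stats`, the grid size, and the use of the trapezoidal rule are assumptions of this sketch and are not taken from the program used in the paper.

```python
import numpy as np

def _integrate(values, t):
    """Trapezoidal rule on the sample points t."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(t)))

def quantizer_stats(x, p, f=None, grid=2001):
    """Return (y, D, E) for decision levels x[0] < x[1] < ... < x[N].

    p is the probability density and f the weighting function
    (f defaults to p). Each cell is assumed to carry positive weight.
    """
    f = p if f is None else f
    y, D, E = [], 0.0, 0.0
    for lo, hi in zip(x[:-1], x[1:]):
        t = np.linspace(lo, hi, grid)
        w = f(t)
        yj = _integrate(t * w, t) / _integrate(w, t)   # centroid, eq. (5)
        D += _integrate((t - yj) ** 2 * w, t)          # distortion, eq. (3)
        pj = _integrate(p(t), t)                       # cell probability
        if pj > 0.0:
            E -= pj * np.log2(pj)                      # entropy, eq. (2)
        y.append(yj)
    return np.array(y), D, E

def uniform(t):
    """Uniform density 1/32 on [-16, 16]."""
    return np.full_like(t, 1.0 / 32.0)

# Four equal cells on [-16, 16]
levels, D, E = quantizer_stats(np.linspace(-16.0, 16.0, 5), uniform)
```

For this four-cell uniform quantizer the centroids are the cell midpoints, the distortion is the cell width squared over 12, and the entropy is 2 bits, which gives a quick sanity check on any implementation of eqs. (2), (3), and (5).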

III. FIXED-POINT APPROACH 

In this section, we formulate the quantization problem as a
fixed-point problem and give a general description of the algorithm
that solves this problem. This algorithm is based on the theory of
complementary pivoting. 24

Given a point-to-set mapping $\Gamma$ [i.e., to each point x in $R^n$
it associates a subset $\Gamma(x)$ of $R^n$], a fixed point of such a
mapping is a point x such that $x \in \Gamma(x)$. We show that the
problem of finding the parameters of the optimal quantizer can be
formulated as a problem of finding a fixed point of a certain
point-to-set mapping.

3.1 Fixed-point formulation 

Let $\nabla D$ and $\nabla \mathcal{E}$ be the gradient vectors of the
distortion D and entropy $\mathcal{E}$, respectively. Then, consider
the following point-to-set mapping:

$$\Gamma(x) = \begin{cases}
x - \{\nabla D(x)\}, & \mathcal{E}(x) < K \\
x - \operatorname{hull}\{\nabla D(x), \nabla \mathcal{E}(x)\}, & \mathcal{E}(x) = K \\
x - \{\nabla \mathcal{E}(x)\}, & \mathcal{E}(x) > K
\end{cases} \tag{7}$$

where hull{E} is the smallest convex set containing E, i.e., the convex
hull of E, and $x - A = \{x - y : y \in A\}$ for a set A in $R^n$. Note
that the mapping as defined is upper semicontinuous* (u.s.c.) and the
set $\Gamma(x)$ is convex for each x. As we subsequently see, these
properties are needed if the algorithm is to find a fixed point of
$\Gamma$.

We now show that a fixed point of this mapping satisfies the necessary
conditions of Section II.

Theorem: Let $x \in \Gamma(x)$. Then, if $\mathcal{E}(x) \le K$, x
satisfies the necessary conditions of Section II. Otherwise, x is a
local minimizer of $\mathcal{E}(x)$.

Proof: We construct the required $\lambda$ and show that (4) and (6)
are satisfied. Since $x \in \Gamma(x)$ and $\mathcal{E}(x) \le K$, we
have two cases:

Case (i): $\mathcal{E}(x) < K$. Let $\lambda = 0$. Since
$0 \in \{\nabla D(x)\}$, we have
$\nabla D(x) + \lambda \nabla \mathcal{E}(x) = 0$, satisfying (4). Note
that $\lambda[\mathcal{E}(x) - K] = 0$.

Case (ii): $\mathcal{E}(x) = K$. Then, as
$0 \in \operatorname{hull}\{\nabla D(x), \nabla \mathcal{E}(x)\}$,
there exist $\lambda_1 + \lambda_2 = 1$, $\lambda_1 \ge 0$,
$\lambda_2 \ge 0$ such that

$$\lambda_1 \nabla D(x) + \lambda_2 \nabla \mathcal{E}(x) = 0. \tag{8}$$

Now, in case $\lambda_1 \ne 0$, letting
$\lambda = \lambda_2/\lambda_1 \ge 0$, (4) is satisfied, and
$\lambda[\mathcal{E}(x) - K] = 0$. In the contrary case, a constraint
qualification would be violated.

In case $x \in \Gamma(x)$ and $\mathcal{E}(x) > K$, then, since
$0 \in \{\nabla \mathcal{E}(x)\}$, $\nabla \mathcal{E}(x) = 0$ and we
have a local minimizer of $\mathcal{E}(x)$. If $\mathcal{E}(x)$ were a
convex function, our problem would have no feasible solution [i.e., no
x such that $\mathcal{E}(x) \le K$]. In the contrary case, we would
conclude the algorithm has failed.

*A mapping $\Gamma$ is u.s.c. if, for any two sequences $\{x^k\}$,
$\{y^k\}$ such that $x^k \to x$, $y^k \in \Gamma(x^k)$, and
$y^k \to y$, we have $y \in \Gamma(x)$.
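
The case analysis of the proof can be sketched in a few lines of Python. Given the two gradients and the entropy at a candidate point, the function below returns the multiplier $\lambda$ of (4) when the point is a fixed point of the mapping (7) with the constraint satisfied, and `None` otherwise. The name `kkt_multiplier`, the tolerance handling, and the projection used to test membership of the origin in the hull are illustrative assumptions, not part of the original program.

```python
import numpy as np

def kkt_multiplier(grad_D, grad_E, entropy, K, tol=1e-9):
    """Return lambda >= 0 satisfying (4) and (6), or None."""
    gD = np.asarray(grad_D, dtype=float)
    gE = np.asarray(grad_E, dtype=float)
    if entropy < K - tol:
        # Case (i): constraint inactive, so grad D itself must vanish.
        return 0.0 if np.linalg.norm(gD) <= tol else None
    if entropy > K + tol:
        # Infeasible point: no multiplier in the sense of Section II.
        return None
    # Case (ii): find the point of hull{gD, gE} closest to the origin,
    # lam1 * gD + (1 - lam1) * gE with lam1 in [0, 1].
    diff = gD - gE
    denom = float(diff @ diff)
    lam1 = 1.0 if denom == 0.0 else float(np.clip(-(gE @ diff) / denom, 0.0, 1.0))
    residual = lam1 * gD + (1.0 - lam1) * gE
    if np.linalg.norm(residual) > tol:
        return None          # the origin is not in the hull
    if lam1 <= tol:
        return None          # lam1 = 0: constraint qualification fails
    return (1.0 - lam1) / lam1   # lambda = lam2 / lam1, as in the proof

lam = kkt_multiplier([2.0, -4.0], [-1.0, 2.0], entropy=1.5, K=1.5)
```

In the usage line the two gradients are antiparallel, so the origin lies in their hull with weights 1/3 and 2/3, and the multiplier is their ratio.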

3.2 Description of the algorithm

In this section we give a brief description of the algorithm that
computes fixed points of point-to-set mappings. Before going into the
details of the algorithm, we introduce some notation.

Given a set C in $R^n$ and a point-to-set mapping $\Gamma$, by
$\Gamma(C)$ we represent the set $\bigcup_{x \in C} \Gamma(x)$. Also,
given a one-to-one linear mapping r, we say a set C is

(i) $\Gamma$-complete if $0 \in \operatorname{hull}\{\Gamma(C)\}$,
(ii) r-complete if $0 \in \operatorname{hull}\{r(C)\}$, and
(iii) $\Gamma \cup r$-complete if $0 \in \operatorname{hull}\{\Gamma(C) \cup r(C)\}$.

The significance of $\Gamma$-complete sets is the following: in case
$\Gamma$ is u.s.c. and $\Gamma(x)$ is convex for each x, a sequence
$C_i$, $i = 1, 2, \ldots$, of $\Gamma$-complete sets whose diameter
approaches 0 as i approaches $\infty$ converges to a fixed point of
$\Gamma$ (see, for example, Refs. 25-27). The fixed-point algorithms
are designed to find such a sequence of $\Gamma$-complete sets.

These algorithms work with sets C that are simplexes of appropriate
dimension. (An n-dimensional simplex is a convex body obtained by
taking the convex hull of n + 1 affinely independent points in n-space.
A two-dimensional simplex is a triangle; a three-dimensional simplex
is a tetrahedron.) They start with a unique r-complete simplex and
generate a sequence of $\Gamma \cup r$-complete simplexes that
terminates with a $\Gamma$-complete simplex. There are essentially two
basic algorithms that can be used to generate a sequence of
$\Gamma$-complete simplexes of decreasing diameters. They are the
restart method of Merrill 27 and the continuous deformation method of
Eaves and Saigal. 26 A study of both these methods can be found in
Saigal. 28-30

We now discuss an application of the algorithm. A real number
d > 0 is chosen. Then the space $R^n \times [0, d]$ is triangulated
(i.e., each point in the space lies in an (n + 1)-dimensional simplex,
and these simplexes overlap only on their boundaries) such that the
vertices of the triangulation are only in the set
$R^n \times \{d/2^k\}$, $k = 0, 1, \ldots$. In addition, the diameter
of each n-dimensional face of each (n + 1)-dimensional simplex that
lies in $R^n \times [d/2^{k+1}, d/2^k]$ is at most $d/2^k$. Now, an
arbitrary starting point $x^0$ is chosen. We then define

$$r(x) = -x + x^0, \tag{9}$$

which is a one-to-one linear mapping.

The sequence of $\Gamma \cup r$-complete simplexes is then generated as
follows:

Step 1: Start with an r-complete simplex in the triangulation that
        contains $(x^0, d)$. The triangulation is arranged in such a
        way that there is a unique such simplex, and that this simplex
        has exactly one vertex in $R^n \times \{d/2\}$ and (n + 1)
        vertices in $R^n \times \{d\}$. The entering vertex is the one
        in $R^n \times \{d/2\}$. Design the labeling function L on the
        vertices of the triangulation with

$$L(x, t) = \begin{cases}
z - x \ \text{for some } z \in \Gamma(x), & t < d \\
x^0 - x, & t = d
\end{cases} \tag{10}$$

Step 2: Find the label on the entering vertex.

Step 3: Find a new $\Gamma \cup r$-complete simplex that includes the
        entering vertex, in place of some vertex of the older simplex.
        This is equivalent to the basic pivot operation of the simplex
        method. 31

Step 4: Find the other (n + 1)-dimensional simplex that contains
        the new $\Gamma \cup r$-complete simplex found in Step 3, and
        determine the entering vertex.

Step 5: If the entering vertex is outside $R^n \times [d/2^K, d]$,
        stop. The earlier $\Gamma \cup r$-complete simplex is actually
        $\Gamma$-complete. Otherwise, go to Step 3.
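
The labeling function (10) combined with the mapping (7) can be sketched as follows. On the top layer t = d the label comes from the artificial map r of (9); below it, the label z - x for a z in $\Gamma(x)$ is simply minus one of the gradients. The callables `grad_D`, `grad_E`, and `entropy` are hypothetical placeholders supplied by the caller, and choosing $-\nabla D$ on the boundary $\mathcal{E}(x) = K$ is one admissible selection from the hull, assumed here for simplicity.

```python
import numpy as np

def label(x, t, d, x0, grad_D, grad_E, entropy, K):
    """Label of the vertex (x, t) following eq. (10)."""
    x = np.asarray(x, dtype=float)
    if t >= d:
        # Top layer: label from r(x) = -x + x0, eq. (9).
        return np.asarray(x0, dtype=float) - x
    if entropy(x) > K:
        # Infeasible side of (7): z - x = -grad E(x).
        return -np.asarray(grad_E(x), dtype=float)
    # Feasible side (and, by choice, the boundary): z - x = -grad D(x).
    return -np.asarray(grad_D(x), dtype=float)

# Toy functions, for illustration only
toy_grad_D = lambda x: 2.0 * x
toy_grad_E = lambda x: np.ones_like(x)
toy_entropy = lambda x: float(np.sum(x))

inner = label([0.2, 0.3], 0.5, 1.0, [0.0, 0.0], toy_grad_D, toy_grad_E, toy_entropy, K=1.0)
outer = label([1.0, 1.0], 0.5, 1.0, [0.0, 0.0], toy_grad_D, toy_grad_E, toy_entropy, K=1.0)
top = label([1.0, 1.0], 1.0, 1.0, [0.5, 0.5], toy_grad_D, toy_grad_E, toy_entropy, K=1.0)
```

The pivoting of Steps 2 through 5 only ever needs one such label per entering vertex, which is why each iteration costs at most one evaluation of the underlying functions.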

Having found a $\Gamma$-complete simplex $\tau$, say, whose vertices
are $V^1, V^2, \ldots, V^{n+1}$, where $V^i = (v^i, d_i)$,
$i = 1, \ldots, n + 1$, we have determined points
$z^i \in \Gamma(v^i)$ and a
$\lambda = (\lambda_1, \ldots, \lambda_{n+1}) \ge 0$ such that

$$\sum_{i=1}^{n+1} \lambda_i z^i = 0, \qquad \sum_{i=1}^{n+1} \lambda_i = 1 \tag{11}$$

has a solution. In this case, we say that the point x determined by

$$x = \sum_{i=1}^{n+1} \lambda_i v^i \tag{12}$$

is an approximate fixed point (for justification, see Ref. 26).

Since the stopping criterion at Step 5 requires that we generate a
vertex in $R^n \times \{d/2^K\}$, we have generated a sequence of
$\Gamma$-complete sets $C_i$, the last one of diameter less than
$d/2^K$, and have thus found a reasonable solution.

The procedure generally used for triangulating $R^n \times (0, d]$ is
called $J_3$ in the literature. For a more detailed description of this
algorithm, the reader is referred to Ref. 32.
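
Once a complete simplex is in hand, the weights of (11) and the approximate fixed point of (12) follow from one small linear solve. The sketch below assumes that the vectors entering (11) are passed in as `labels`; in the usage example they are taken to be the differences $z^i - v^i$ produced by the labeling function (10), which is an interpretation adopted for this illustration. The function name and the toy map are hypothetical.

```python
import numpy as np

def approximate_fixed_point(vertices, labels, tol=1e-9):
    """Solve (11) for the weights and return (x, weights) from (12).

    vertices, labels: arrays of shape (n + 1, n).
    """
    V = np.asarray(vertices, dtype=float)
    L = np.asarray(labels, dtype=float)
    n = V.shape[1]
    # Rows 0..n-1: sum_i lam_i * label_i = 0; last row: sum_i lam_i = 1.
    A = np.vstack([L.T, np.ones(n + 1)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    lam = np.linalg.solve(A, b)   # raises LinAlgError if A is singular
    if np.any(lam < -tol):
        raise ValueError("simplex is not complete: a weight is negative")
    return lam @ V, lam

# Toy single-valued map z = 0.5 * v + 1, whose fixed point is (2, 2)
verts = np.array([[1.0, 1.0], [4.0, 1.0], [1.0, 4.0]])
labs = (0.5 * verts + 1.0) - verts
x_approx, weights = approximate_fixed_point(verts, labs)
```

Because the toy map is affine, linear interpolation over the triangle is exact and the sketch recovers the fixed point itself; for a general mapping the result is only as accurate as the diameter of the final simplex allows.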
IV. EXAMPLES

In this section, we discuss some examples that we solved using the
algorithm described in the previous section. Three of the four examples
had $f(\cdot) = p(\cdot)$, corresponding to mean square quantization
error as a measure of distortion. The fourth example, on the other
hand, uses a different weighting of the quantization noise; it is
motivated by the problem of quantizer design for simple element
differential coding of picture signals. 9 The examples are:

(i) $f(x) = p(x) = 1/32$ for $-16 \le x \le +16$, and $0$ otherwise;

(ii) $f(x) = p(x) = \frac{1}{\alpha} e^{-\alpha|x|}$,
     $-\infty < x < +\infty$, $\alpha = 0.1$;

(iii) $f(x) = p(x) = \dfrac{\exp(-x^2/2\sigma)}{\sqrt{2\pi\sigma}}$,
      $-\infty < x < +\infty$, $\sigma = 1$;

(iv) $f(x) = \frac{1}{\beta} e^{-\beta|x|}$;
     $p(x) = \frac{1}{\alpha} e^{-\alpha|x|}$,
     $-\infty < x < +\infty$, with
     $\alpha = 0.18$, $\beta = 0.1$; and
     $\alpha = 0.1$, $\beta = 0.065$.                              (13)

Due to symmetry of the functions $f(\cdot)$ and $p(\cdot)$, the
optimum quantizers are symmetric and, for simplicity, quantizers were
therefore constrained to be symmetric. Also, without loss of any
generality, only quantizers having odd numbers of levels were
considered. In each case, several problems were solved by varying the
entropy constraint and the number of levels. The number of levels was
varied from 3 to 21, and the entropy constraint was varied from 1.0 bit
to the largest value possible with a particular number of levels.

Fig. 1. Quantizer performance for uniform density. Minimum mean square
error (mmse) is plotted against entropy for a fixed number of levels
(N). Only odd-level quantizers are considered. For each fixed number of
levels, mmse decreases with entropy up to a certain point, after which
there is no further decrease in mean square error.

Fig. 2. Quantizer performance for Laplacian density.

Fig. 3. Quantizer performance for gaussian density.

1430 THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1976

Fig. 4. Quantizer performance for Laplacian density and exponential
weighting. Quantization noise is weighted by
$(1/|\beta|) \exp(-\beta|x|)$, whereas the probability density is taken
to be $(1/|\alpha|) \exp(-\alpha|x|)$. Such situations arise in
quantization of the prediction errors in predictive coding of
television signals: $\alpha = 0.1$, $\beta = 0.065$.

Fig. 5. Quantizer performance for Laplacian density and exponential
weighting: $\alpha = 0.18$, $\beta = 0.1$.

Results of these simulations are given in Figs. 1 through 5. In these
figures the distortion is plotted logarithmically on the y-axis and the
entropy is plotted linearly on the x-axis in bits. Alternate solid and
broken lines are shown for different numbers of quantizer levels. For a
given number of levels, the minimum distortion decreases approximately
exponentially with respect to the entropy up to a certain point; beyond
that point the entropy constraint is no longer operative, and
consequently the distortion remains constant. These are indeed
Lloyd-Max 11-12 quantizers that minimize the distortion for a given
number of levels with no constraint on the entropy. The distortion
versus entropy curves are lower bounded by the following functions
(E denotes the entropy in bits):

Example (1) D = 85.3 exp(-1.39E)

Example (2) D = 196.18 exp(- 1.Z2E) 

Example (3) D = 1.40 exp(-1.39E) (14)

Example (4a) D = 62.50 exp(- 1ME) 

Example (4b) D = 176.71 exp(-1.24E).
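
As a small arithmetic aside on (14), the exponent 1.39 of Examples (1) and (3) is close to ln 4 = 1.386, so these two bounds fall by a factor of about 4 (roughly 6 dB) for each additional bit of entropy. The constant 85.3 of Example (1) is also close to 32^2/12 = 85.33, the variance of the uniform density on [-16, 16]. The Python sketch below checks both observations; the helper name `lower_bound` is hypothetical, and the observations are ours rather than statements made in the paper.

```python
import math

def lower_bound(c, k, entropy_bits):
    """Evaluate a bound of the form D = c * exp(-k * E) from (14)."""
    return c * math.exp(-k * entropy_bits)

# Example (1): effect of one extra bit of entropy on the bound
ratio_per_bit = lower_bound(85.3, 1.39, 2.0) / lower_bound(85.3, 1.39, 3.0)

# Variance of the uniform density on [-16, 16]
uniform_variance = 32.0 ** 2 / 12.0
```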

In the case of uniform densities, the optimum quantizer is nonuniform
whenever the entropy constraint is operative, but when the entropy
constraint is large enough to be inoperative, the optimum quantizers
are uniform. Laplacian densities, on the other hand, always have
uniform quantizers as the optimum quantizers. This has been shown by
Berger. 20 In the case of gaussian density, the optimum quantizers were
not uniform; however, a comparison of our results with those given by
Goblick and Holsinger 18 indicates that, although nonuniform quantizers
perform better than uniform quantizers, the differences in the
performance of the two are small. This conclusion has also been reached
by Wood 19 and Berger. 20

The case of an exponential weighting function falling slower than the
probability density function arises in quantization of the prediction
error in a simple element differential coding of picture signals. In
this case, the density of the prediction error is approximately
Laplacian, whereas the perceptual visibility 9,33 of the quantization
noise may be approximated by an exponential function decaying somewhat
slower than the probability density. The distortion-entropy curves for
this case show larger improvement (that is, for a given entropy the
distortion decreases much more than in the previous examples) as the
number of levels is increased. Also, the optimum quantizers are
nonuniform. Improvement in their performance over that of the uniform
quantizers is more significant than in the previous examples.

It is interesting to note that our algorithm can solve the Lloyd-Max
problem trivially by setting the entropy constraint to a very high
value. This algorithm was also used in other applications related to
adaptive quantization 34 of picture signals. The problems in this case
had uniform (constant) weighting functions and two-sided exponentials
as the density functions. The resulting quantizers had interesting
structure and were used quite successfully.
4.1 Computational effort as a function of n

The increase in computational effort as a function of the dimension n
of the problem is important in the study of algorithms. In the case of
fixed-point algorithms, Saigal 28 had speculated that different
triangulations would have different effects on this growth and he was
the first to propose a measure to describe it. For the triangulation
employed in our experimentation, his measure predicted the growth rate
of the number of iterations as $n^2$. Subsequently, Todd 35 refined his
measure to predict an "average" growth rate of the iterations as
$n^{1.5}$. The measure of Saigal, in some sense, predicts the "worst
case" behavior.

The computational experiments in Section IV were ideally suited
to test the theoretical predictions of Refs. 28 and 35, since the
dimension of the problem was increased in a regular manner, the
starting points were chosen in a regular way, and problems of
dimensions varying between 1 and 10 were solved. A number of results
for various entropy values were plotted on log-log paper. A
representative plot is given in Fig. 6. It is seen that the
experimental points lie on a straight line. The slope of these lines
for different cases was a function of the entropy constraint and the
probability density used and varied from 1.55 to 1.88, which is between
the 1.5 predicted by Todd 35 and the 2 predicted by Saigal. 28

Fig. 6. Growth of computations vs dimensionality. Number of
representative levels N is two times the dimensionality n plus 1.
Straight line drawn is the minimum mean square error fit to the
plotted observations.

Thus, we can conclude, with a high degree of certainty, that the
number of iterations the algorithm needs to solve a problem of
dimension n grows as $a n^b$ for some b between 1.5 and 2.0. Since each
iteration requires $O(n^2)$ multiplications and at most one evaluation
of the function, the number of function evaluations is bounded by
$O(n^2)$ and the number of multiplications by $O(n^4)$.
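
The straight-line fit on log-log axes described above amounts to a least-squares fit of log(iterations) against log(n), whose slope is the exponent b and whose intercept gives the constant a. A minimal Python sketch follows; the function name `fit_power_law` is hypothetical, and the iteration counts are synthetic values generated for illustration, not measurements from the paper.

```python
import numpy as np

def fit_power_law(n, iterations):
    """Least-squares fit of iterations = a * n**b on log-log axes."""
    slope, intercept = np.polyfit(np.log(n), np.log(iterations), 1)
    return float(np.exp(intercept)), float(slope)

# Synthetic data following an exact power law (NOT data from the paper)
n = np.arange(1, 11)
synthetic_iterations = 150.0 * n ** 1.7

a, b = fit_power_law(n, synthetic_iterations)
```

With measured iteration counts in place of the synthetic array, the returned b can be compared directly with the 1.5 and 2 predicted by the two measures.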

V. CONCLUSIONS 

A fixed-point formulation has been developed to minimize the
distortion, using a fairly general distortion measure, with respect to
parameters of a quantizer under an entropy constraint on the quantized
output. A point-to-set mapping is first developed whose fixed point
satisfies the necessary conditions for a local extremum. Then a
computer program is developed to compute its fixed points. Several
examples are solved to show the usefulness of the algorithm. Finally,
the rate of growth of the computations used by the algorithm as a
function of the dimensionality of the problem is also discussed.